-
Free, publicly-accessible full text available December 15, 2025
-
Storing tabular data to balance storage and query efficiency is a long-standing research question in the database community. In this work, we argue and show that a novel DeepMapping abstraction, which relies on the impressive memorization capabilities of deep neural networks, can provide better storage cost, better latency, and better run-time memory footprint, all at the same time. Such unique properties may benefit a broad class of use cases in capacity-limited devices. Our proposed DeepMapping abstraction transforms a dataset into multiple key-value mappings and constructs a multi-tasking neural network model that outputs the corresponding values for a given input key. To deal with memorization errors, DeepMapping couples the learned neural network with a lightweight auxiliary data structure capable of correcting mistakes. The auxiliary structure design further enables DeepMapping to efficiently deal with insertions, deletions, and updates even without retraining the mapping. We propose a multi-task search strategy for selecting the hybrid DeepMapping structures (including model architecture and auxiliary structure) with a desirable trade-off among memorization capacity, size, and efficiency. Extensive experiments with a real-world dataset as well as synthetic and benchmark datasets, including TPC-H and TPC-DS, demonstrated that the DeepMapping approach can better balance retrieval speed and compression ratio than several cutting-edge competitors.
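The hybrid idea in this abstract (an imperfect learned mapping paired with an auxiliary structure that corrects its mistakes and absorbs modifications) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `HybridMapping`, `toy_model`, the `live_keys` set, and the `corrections` dictionary are illustrative stand-ins, and `toy_model` is a fixed function used in place of a trained neural network.

```python
from typing import Callable, Dict, Optional, Set


class HybridMapping:
    """Imperfect predictor plus an auxiliary structure that corrects it.

    Illustrative only: a plain set and dict stand in for whatever compact
    auxiliary structure a real system would use.
    """

    def __init__(self, model: Callable[[int], str], data: Dict[int, str]) -> None:
        self.model = model
        # Keys that currently exist in the mapping.
        self.live_keys: Set[int] = set(data)
        # Store only the entries the model gets wrong.
        self.corrections: Dict[int, str] = {
            k: v for k, v in data.items() if model(k) != v
        }

    def get(self, key: int) -> Optional[str]:
        if key not in self.live_keys:
            return None  # never inserted, or deleted
        if key in self.corrections:
            return self.corrections[key]  # override a wrong prediction
        return self.model(key)

    def put(self, key: int, value: str) -> None:
        """Insert or update without retraining the model."""
        self.live_keys.add(key)
        if self.model(key) == value:
            self.corrections.pop(key, None)  # model is already right
        else:
            self.corrections[key] = value

    def delete(self, key: int) -> None:
        """Delete without retraining the model."""
        self.live_keys.discard(key)
        self.corrections.pop(key, None)


def toy_model(key: int) -> str:
    # Assumption: a fixed function standing in for a trained network.
    return ("low", "mid", "high")[key % 3]


# Ten rows that mostly follow the pattern the stand-in model captures.
data = {k: toy_model(k) for k in range(10)}
data[4] = "high"  # toy_model(4) returns "mid", so this row is mispredicted

store = HybridMapping(toy_model, data)
```

In this sketch only the mispredicted row (key 4) is stored explicitly; every other lookup is answered by the model. Calls such as `store.put(3, "high")` or `store.delete(4)` touch only the auxiliary structure, which is the property the abstract describes as handling modifications without retraining.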
-
Successfully tackling many urgent challenges in socio-economically critical domains, such as public health and sustainability, requires a deeper understanding of causal relationships and interactions among a diverse spectrum of spatio-temporally distributed entities. In these applications, the ability to leverage spatio-temporal data to obtain causally based situational awareness and to develop informed forecasts to provide resilience at different scales is critical. While the promise of a causally grounded approach to these challenges is apparent, the core data technologies needed to achieve these are in the early stages and lack a framework to help realize their potential. In this article, we argue that there is an urgent need for a novel paradigm of spatio-causal research built on computational advances in spatio-temporal data and model integration, causal learning and discovery, large scale data- and model-driven simulations, emulations, and forecasting, as well as spatio-temporal data-driven and model-centric operational recommendations, and effective causally driven visualization and explanation. We thus provide a vision, and a road map, for spatio-causal situation awareness, forecasting, and planning.